-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Flowlogs: make calico-node to fetch flow logs from a node #10144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass, want to take a closer look yet
felix/collector/goldmane/client.go
Outdated
@@ -48,6 +53,9 @@ func NewReporter(addr, cert, key, ca string) (*GoldmaneReporter, error) { | |||
return &GoldmaneReporter{ | |||
address: addr, | |||
client: cli, | |||
|
|||
// Do not send flowlogs to node socket, if goldmane address set via FelixConfig is equal to node socket | |||
mayReportToNodeSocket: addr != NodeSocketAddress, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much work would it be to lift this out of the Goldmane reporter and into its own Reporter, and then adding the ability to run multiple reporters + dynamically add / remove them?
It seems to me that we are going to want the ability to have multiple Reporters running at once - it would be cool to be able to use this e.g., even if Goldmane wasn't configured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not much! Collector already supports multiple reporters.
But if we want to be able to run this without goldmane enabled, we should introduce a config option to enabled/disable it. Something like FlowLogsGoldmaneNodeServer
with Enabled
and Disabled
values. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems like a preferrable route to me, if it's not too much work. A new LocalFlowSocket: Enabled
or something would make sense as the config option name probably?
Co-authored-by: Casey Davenport <[email protected]>
Co-authored-by: Casey Davenport <[email protected]>
|
@tomastigera you are right. It's with one dash. |
@@ -835,6 +835,10 @@ type FelixConfigurationSpec struct { | |||
// FlowLogGoldmaneServer is the flow server endpoint to which flow data should be published. | |||
FlowLogsGoldmaneServer *string `json:"flowLogsGoldmaneServer,omitempty"` | |||
|
|||
// FlowLogsLocalSocket configures local unix socket for reporting flow data from each node. [Default: Disabled] | |||
// +kubebuilder:validation:Enum=Disabled;Enabled | |||
FlowLogsLocalSocket *string `json:"flowLogsLocalSocket,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we should name this FlowLogsLocalReporter
or FlowLogsNodeReporter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense to me - I like the Local
terminology rather than Node
myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mazdakn I think this looks good, but I'd like to clean up some terminology and structure a bit - namely there are references to both NodeReporter and LocalReporter and I think we should standardize on the latter.
Also, its referred to as goldmane
in a few places but I don't think it is tied to Goldmane any more?
felix/collector/dpstatshelper.go
Outdated
log.Info("Adding Flow Logs Aggregator (allowed) for node socket") | ||
fr.AddAggregator(gaa, []string{FlowLogsLocalReporterName}) | ||
log.Info("Creating node socket Aggregator for denied") | ||
gad := flowAggregatorForGoldmane(rules.RuleActionDeny, configParams.FlowLogsCollectorDebugTrace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still find this a bit confusing - this reporter isn't really for goldmane, rather it's for the node socket, right? Is this just a case of an awkward function name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it to defaultFlowAggregator
CI passed in the last run. |
Description
Add a new command to
calico-node
binary to list or watch flow logs from a specific node. This is very useful for troubleshooting. The command line will be:if n is positive then it will watch for n logs and exits. If n is negative, then it watch for flowlogs for ever until interrupted.
In a cluster, it can be used as the following to watch for flowlogs for ever:
Sample output:
To enable reporting flow logs to local node socket, a new Felix config, named
flowLogsLocalReporter
is introduced, which can be set to eitherDisabled
(default) orEnabled
.Related issues/PRs
Todos
Release Note
Reminder for the reviewer
Make sure that this PR has the correct labels and milestone set.
Every PR needs one
docs-*
label.docs-pr-required
: This change requires a change to the documentation that has not been completed yet.docs-completed
: This change has all necessary documentation completed.docs-not-required
: This change has no user-facing impact and requires no docs.Every PR needs one
release-note-*
label.release-note-required
: This PR has user-facing changes. Most PRs should have this label.release-note-not-required
: This PR has no user-facing changes.Other optional labels:
cherry-pick-candidate
: This PR should be cherry-picked to an earlier release. For bug fixes only.needs-operator-pr
: This PR is related to install and requires a corresponding change to the operator.